# GQA Efficient Inference
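The heading refers to grouped-query attention (GQA), which Llama-family models use for efficient inference: several query heads share each key/value head, shrinking the KV cache by the grouping factor. A minimal NumPy sketch with illustrative shapes (not any listed model's actual configuration):

```python
import numpy as np

def gqa_attention(q, k, v, n_kv_heads):
    """Grouped-query attention: the query heads share n_kv_heads
    key/value heads (number of query heads must be a multiple of it).

    q: (n_q_heads, seq, d)   k, v: (n_kv_heads, seq, d)
    """
    n_q_heads, seq, d = q.shape
    group = n_q_heads // n_kv_heads
    # Repeat each KV head so every query head in its group attends
    # to the same keys/values -- only n_kv_heads K/V tensors are cached.
    k = np.repeat(k, group, axis=0)                   # (n_q_heads, seq, d)
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)    # (n_q_heads, seq, seq)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ v                                # (n_q_heads, seq, d)

# Toy example: 8 query heads sharing 2 KV heads (4x smaller KV cache).
rng = np.random.default_rng(0)
out = gqa_attention(rng.normal(size=(8, 5, 16)),
                    rng.normal(size=(2, 5, 16)),
                    rng.normal(size=(2, 5, 16)), n_kv_heads=2)
print(out.shape)  # → (8, 5, 16)
```

With multi-head attention the KV cache scales with the number of query heads; with GQA it scales only with `n_kv_heads`, which is the memory saving the heading alludes to.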
## Llama-3.1-Minitron-4B-Width-Base (nvidia)
Llama-3.1-Minitron-4B-Width-Base is a base text-to-text model obtained by width-pruning Llama-3.1-8B, suitable for a range of natural language generation tasks.
Tags: Large Language Model · Transformers · English · License: Other
Downloads: 10.15k · Likes: 190
## Minitron-8B-Base (nvidia)
Minitron-8B-Base is a large language model obtained by pruning Nemotron-4 15B and recovering accuracy through knowledge distillation and continued training, requiring up to 40× fewer training tokens and 1.8× less compute than training from scratch.
Tags: Large Language Model · Transformers · English · License: Other
Downloads: 5,725 · Likes: 66
## Llama-3.1-8B (meta-llama)
Meta Llama 3.1 is a family of multilingual large language models with 8B, 70B, and 405B pretrained and instruction-tuned generative variants, optimized for multilingual dialogue use cases.
Tags: Large Language Model · Transformers · Multilingual
Downloads: 1.0M · Likes: 1,583
## Meta-Llama-3-70B (meta-llama)
Meta Llama 3 is a family of large language models with 8B and 70B pretrained and instruction-tuned generative text variants, optimized for dialogue use cases and performing strongly on industry benchmarks.
Tags: Large Language Model · Transformers · English
Downloads: 15.32k · Likes: 857